ABSTRACT

We present a framework for the joint processing of multimodal data
such as audio-video data streams. We first consider the problem
of estimating the joint distribution of statistically dependent
multimodal random variables. We discuss the issues involved and provide
a copula based solution. Application of this approach to solve a
multisensor fusion problem for the detection of a random event is also
discussed.

Index Terms: Copula theory, Multisensor fusion, Hypothesis
testing, Kullback-Leibler distance, Statistical dependence

1.  INTRODUCTION 

Statistical signal processing tasks such as detection, estimation and
tracking generally require complete specification of the joint
probability distribution of the observed samples. However, in many cases,
the derivation of the joint probability density function (PDF)
becomes mathematically intractable. In problems such as multimodal
signal processing, random variables associated with each modality
may follow probability distributions that are different from one
another. This is due to several physical differences such as in their
dimensionality, support and sampling rate. Moreover, in most
applications, the signals share a common source and thus may exhibit
statistical dependency. Consider, for example, an acoustic sensor
and a video camera monitoring a region for trespassers. Presence of
a target may result in an increase in both the acoustic energy and the
pixel intensities of the images acquired by the video camera. Both
sensors provide information about the same event (and hence are
statistically dependent) but in different 'domains'. In this case, it is
highly likely that the acoustic features and the pixel intensities will
not follow the same probability distribution. We are thus faced with
two challenges when modeling the joint distribution of random
variables corresponding to multimodal data:

• How do we characterize the inter-modal dependence structure?

•  How  do  we  model  the  joint  distribution  between  statistically 
dependent  multimodal  measurements  when  the  underlying 
marginals  follow  disparate  distributions? 

We discuss parametric modeling of multimodal data in Section
2. We show here how copula functions provide an approach to model
the joint multimodal statistics. We formulate a binary hypotheses
testing problem in Section 3 and present a multisensor detection
example in Section 4.
2. MODELING

The acoustic-video sensor example given above motivates the
following general definition for heterogeneous random vectors that
would arise with multimodal signals.

Definition 1 A random vector $\mathbf{Z} = \{Z_n\}_{n=1}^{N}$ governing the joint
statistics of an N-variate data set can be termed as heterogeneous if
the marginals $Z_n$ are non-identically distributed.

In the following, we assume that the marginal PDFs, $f_{Z_n}(z_n)$,
are known and the goal is to construct the joint PDF $f_{\mathbf{Z}}(\mathbf{z})$ of the
multimodal random vector $\mathbf{Z}$. Further, the variables $Z_n$ exhibit
statistical dependence so that $f_{\mathbf{Z}}(\mathbf{z}) \neq \prod_{n=1}^{N} f_{Z_n}(z_n)$.

Characterizing multivariate statistical dependence is one of the
most widely researched topics and has always been a difficult
problem [1]. The most commonly used bivariate measure, Pearson’s
correlation $\rho$, captures the linear relationship between variables and
is a weak measure of dependence when dealing with non-Gaussian
random variables. Two random variables X and Y are said to be
uncorrelated if the covariance, $\Sigma_{X,Y} = E(XY) - E(X)E(Y)$, is zero
($\rho = 0$). Statistical independence has a stricter requirement in that
X and Y can be called independent only if their joint density can be
factored as the product of the marginals. In general, a zero
correlation does not guarantee independence (except when the variables are
jointly Gaussian).
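
The gap between zero correlation and independence can be seen in a small numerical sketch. The choice $Y = X^2$ with standard Gaussian X is an illustrative assumption and does not come from the paper; `pearson` is a hypothetical helper written for this example.

```python
import random

def pearson(xs, ys):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n
    vx = sum((x - mx) ** 2 for x in xs) / n
    vy = sum((y - my) ** 2 for y in ys) / n
    return cov / (vx * vy) ** 0.5

rng = random.Random(0)
x = [rng.gauss(0.0, 1.0) for _ in range(50000)]
y = [v * v for v in x]  # Y is a deterministic function of X

rho_xy = pearson(x, y)                       # close to zero: uncorrelated
rho_absx_y = pearson([abs(v) for v in x], y) # large: the dependence is real

print(f"corr(X, Y)   = {rho_xy:.3f}")
print(f"corr(|X|, Y) = {rho_absx_y:.3f}")
```

Here Y is completely determined by X, yet the linear correlation is near zero, which is the situation a correlation-only model cannot represent.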

The problem is further compounded when dealing with a
multimodal random vector such as $\mathbf{Z}$ with complex inter-modal
interactions between the component variables $Z_n$ that follow disparate
probability distributions. Thus, the derivation of the multimodal joint
PDF becomes difficult and one is often forced to assume a multivariate
Gaussian model or inter-modal independence to construct a tractable
statistical model. A multivariate Gaussian model would require
the marginals to be Gaussian and thus would fail to utilize the
knowledge of the given marginal PDFs. Assuming statistical independence
neglects inter-modal dependence, thus leading to suboptimal
solutions.

Alternatively, we propose in this work a copula based model to
represent the inter-modal dependence structure. Copulas are
functions that couple multivariate joint distributions to their component
marginal distribution functions [2], [3]. The main advantage of the
copula based approach is that it allows us to define inter-modal
dependence irrespective of the underlying marginal distributions. One
can thus construct joint distributions with arbitrary marginals and
the desired dependence structure. This property is especially well
suited for the joint processing of multimodal variables with different
marginal distributions.

Sklar  (1959)  was  the  first  to  define  copula  functions. 
Sklar’s Theorem and its Implications

Theorem 1 (Sklar’s Theorem)

Let $F_{\mathbf{Z}}(z_1, z_2, \ldots, z_N)$ be the joint cumulative distribution
function (CDF) with continuous marginals $F_{Z_1}(z_1), F_{Z_2}(z_2), \ldots,
F_{Z_N}(z_N)$. Then there exists a copula function $C(\cdot)$ such that for all
$z_1, z_2, \ldots, z_N$ in $[-\infty, \infty]$,

$$F_{\mathbf{Z}}(z_1, z_2, \ldots, z_N) = C(F_{Z_1}(z_1), F_{Z_2}(z_2), \ldots, F_{Z_N}(z_N)) \tag{1}$$

For continuous marginals, $C(\cdot)$ is unique; otherwise $C(\cdot)$ is
uniquely determined on $\mathrm{Ran}F_{Z_1} \times \mathrm{Ran}F_{Z_2} \times \cdots \times \mathrm{Ran}F_{Z_N}$ where
$\mathrm{Ran}X$ denotes the range of X. Conversely, if $C(\cdot)$ is a copula
and $F_{Z_1}(z_1), F_{Z_2}(z_2), \ldots, F_{Z_N}(z_N)$ are marginal CDFs then
the function $F_{\mathbf{Z}}(\cdot)$ in (1) is a valid joint CDF with the marginals
$F_{Z_1}(z_1), F_{Z_2}(z_2), \ldots, F_{Z_N}(z_N)$.

Note that the copula function $C(u_1, u_2, \ldots, u_N)$ is itself a CDF
with uniform marginals as $u_n = F_{Z_n}(z_n) \sim U(0, 1)$ (by the
probability integral transform).
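
The probability integral transform can be checked empirically. The following is a minimal sketch, assuming an exponential marginal with rate 2 chosen purely for illustration; a $U(0,1)$ variable has mean 1/2 and variance 1/12, which the transformed samples should approximately reproduce.

```python
import math
import random

def exp_cdf(z, rate):
    """CDF of an exponential random variable with the given rate."""
    return 1.0 - math.exp(-rate * z)

rng = random.Random(1)
rate = 2.0  # illustrative choice, not from the paper
z = [rng.expovariate(rate) for _ in range(50000)]

# u_n = F_{Z_n}(z_n) should be approximately uniform on (0, 1)
u = [exp_cdf(v, rate) for v in z]

mean_u = sum(u) / len(u)
var_u = sum((v - mean_u) ** 2 for v in u) / len(u)
print(f"mean = {mean_u:.4f} (uniform: 0.5), var = {var_u:.4f} (uniform: {1/12:.4f})")
```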

The copula based joint PDF of N continuous heterogeneous
random variables can now be obtained by taking an Nth order derivative
of (1),

$$f_{\mathbf{Z}}(\mathbf{z}) = \left( \prod_{n=1}^{N} f_{Z_n}(z_n) \right) c(F_{Z_1}(z_1), \ldots, F_{Z_N}(z_N)) = f_{\mathbf{Z}}^{c}(\mathbf{z}) \tag{2}$$

where $\mathbf{Z} = [Z_1, Z_2, \ldots, Z_N]$ and we use the superscript 'c' to
denote that $f_{\mathbf{Z}}^{c}(\mathbf{z})$ is the copula representation of $f_{\mathbf{Z}}(\mathbf{z})$. Note that we
need to know the true copula density function $c(\cdot)$ to have an exact
representation as in (2). We emphasize here that any joint PDF with
continuous marginals can be written in terms of a copula function as
in (2) (due to Sklar’s theorem). However, identifying the true copula
is not a straightforward task. A common approach then is to select a
copula function a priori and fit the given marginals and the desired
dependence structure to derive the joint distribution.
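
For N = 2, the differentiation step that leads from the CDF form of Sklar's theorem to the density form can be written out explicitly. This is a sketch that assumes $C$ is twice differentiable and uses the shorthand $u_n = F_{Z_n}(z_n)$:

```latex
\begin{align*}
f_{\mathbf{Z}}(z_1, z_2)
  &= \frac{\partial^2}{\partial z_1 \, \partial z_2}
     C\big(F_{Z_1}(z_1), F_{Z_2}(z_2)\big) \\
  &= \left. \frac{\partial^2 C(u_1, u_2)}{\partial u_1 \, \partial u_2}
     \right|_{u_n = F_{Z_n}(z_n)}
     \frac{dF_{Z_1}(z_1)}{dz_1} \, \frac{dF_{Z_2}(z_2)}{dz_2} \\
  &= c\big(F_{Z_1}(z_1), F_{Z_2}(z_2)\big) \, f_{Z_1}(z_1) \, f_{Z_2}(z_2)
\end{align*}
```

The copula density $c(\cdot)$ is thus the mixed partial derivative of $C(\cdot)$, and the marginal PDFs appear through the chain rule.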

Several copula functions have been defined, especially in the
econometrics and finance literature (e.g. [4]); the popular ones
among them being the multivariate Gaussian copula, Student’s t copula
and copula functions from the Archimedean family. Given a copula
density function $k(\cdot)$ and the marginal distributions, the joint PDF
estimate then has a form similar to (2),

$$f_{\mathbf{Z}}(\mathbf{z}) = \left( \prod_{n=1}^{N} f_{Z_n}(z_n) \right) k(F_{Z_1}(z_1), \ldots, F_{Z_N}(z_N)) = f_{\mathbf{Z}}^{k}(\mathbf{z}) \tag{3}$$

The dependence structure between the marginals is completely
captured by the copula function and is separate from the choice of
the marginals. Next, we describe the joint PDF construction using
copulas.
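
A minimal sketch of a copula based joint density for N = 2, using the bivariate Gaussian copula density, may help make this form concrete. The exponential and Gaussian marginals, the value of the dependence parameter, and the function names are assumptions made for illustration; the inverse normal CDF comes from Python's standard `statistics.NormalDist`.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()

def gaussian_copula_density(u1, u2, theta):
    """Bivariate Gaussian copula density k(u1, u2; theta), with |theta| < 1.

    u1 and u2 must lie strictly inside (0, 1); inv_cdf raises otherwise.
    """
    x1 = STD_NORMAL.inv_cdf(u1)
    x2 = STD_NORMAL.inv_cdf(u2)
    det = 1.0 - theta * theta
    expo = -(theta * theta * (x1 * x1 + x2 * x2) - 2.0 * theta * x1 * x2) / (2.0 * det)
    return math.exp(expo) / math.sqrt(det)

def copula_joint_pdf(z1, z2, marginal1, marginal2, theta):
    """Product of the marginal PDFs times the copula density at the marginal CDFs."""
    pdf1, cdf1 = marginal1
    pdf2, cdf2 = marginal2
    return pdf1(z1) * pdf2(z2) * gaussian_copula_density(cdf1(z1), cdf2(z2), theta)

# Hypothetical heterogeneous marginals: exponential (rate 1) and standard Gaussian
rate = 1.0
exponential = (lambda z: rate * math.exp(-rate * z),
               lambda z: 1.0 - math.exp(-rate * z))
gaussian = (STD_NORMAL.pdf, STD_NORMAL.cdf)

value = copula_joint_pdf(0.7, 0.3, exponential, gaussian, theta=0.5)
print(f"joint density at (0.7, 0.3): {value:.6f}")
```

With `theta = 0` the copula density is identically one and the function returns the product of the marginals. With two Gaussian marginals it reproduces the usual bivariate Gaussian density, which is a convenient sanity check.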


2.1.  Joint  PDF  Construction  using  Copulas 

As an example, consider two random variables $Z_1$ and $Z_2$ associated
with two sources of different modalities. Given a copula function
$K(\cdot)$, we wish to construct a copula based bivariate density function
of the form as in (3). Table 1 lists some of the well known bivariate
copulas. Each of these functions is parameterized by $\theta$, which controls
the 'amount of dependence' between the two variables. Thus, it is
required to estimate $\theta$ from the acquired bivariate observations. We
describe how this is done using nonparametric dependence measures
as this method is computationally efficient.

Table 1. Copula functions

Copula     $C(u_1, u_2)$     Kendall’s $\tau$

Gaussian   $\Phi_N[\Phi^{-1}(u_1), \Phi^{-1}(u_2); \theta]$     $\frac{2}{\pi} \arcsin(\theta)$

Clayton    $\left( u_1^{-\theta} + u_2^{-\theta} - 1 \right)^{-1/\theta}$     $\frac{\theta}{\theta + 2}$

Frank      $-\frac{1}{\theta} \log\left( 1 + \frac{(e^{-\theta u_1} - 1)(e^{-\theta u_2} - 1)}{e^{-\theta} - 1} \right)$     $1 - \frac{4}{\theta} \left[ 1 - \frac{1}{\theta} \int_0^{\theta} \frac{t}{e^t - 1} \, dt \right]$

Gumbel     $\exp\left( -\left\{ (-\log u_1)^{\theta} + (-\log u_2)^{\theta} \right\}^{1/\theta} \right)$     $1 - \frac{1}{\theta}$

Product    $u_1 u_2$     $0$

The copula dependence parameter $\theta$ can be expressed as a
function of Kendall’s $\tau$, a nonparametric measure of association between
two random variables [2]. Specifically, Kendall’s $\tau$ measures the
concordance between two random variables. Let $(z_1(i), z_2(i))$ and
$(z_1(j), z_2(j))$ be two observations from a bivariate measurement
vector $(Z_1, Z_2)$ of continuous random variables. The observations
are said to be concordant if $(z_1(i) - z_1(j))(z_2(i) - z_2(j)) > 0$
and discordant if $(z_1(i) - z_1(j))(z_2(i) - z_2(j)) < 0$.

The population version of Kendall’s $\tau$ can be expressed in terms
of $K(\cdot)$ as

$$\tau_{Z_1, Z_2} = 4 \iint_{[0,1]^2} K(u_1, u_2; \theta) \, dK(u_1, u_2; \theta) - 1 \tag{4}$$

where $u_n = F_{Z_n}(z_n)$. Thus, for a given $\tau$, the integral equation
above can be used to solve for $\theta$. Table 1 shows the relationship
between $\tau$ and $\theta$ for some of the well-known copula functions.

When $\tau$ is unknown, $\theta$ can be obtained from the sample estimate
$\hat{\tau}$. Given L i.i.d. measurements $(z_1(l), z_2(l))$, $(l = 1, 2, \ldots, L)$,
the observations are rank ordered and $\hat{\tau}$ can be computed as

$$\hat{\tau}_{Z_1, Z_2} = \frac{c - d}{c + d} \tag{5}$$

where c and d are the numbers of concordant and discordant pairs,
respectively.
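
The estimation procedure can be sketched in a few lines: count concordant and discordant pairs to obtain the sample Kendall's $\tau$, then invert the $\tau$ to $\theta$ relationship of the chosen copula. The function names are hypothetical, the pair count is a direct $O(L^2)$ loop rather than an optimized rank-based routine, and the Frank copula is omitted because its relationship has no closed-form inverse and would need a numerical root finder.

```python
import math

def kendall_tau(z1, z2):
    """Sample Kendall's tau as (c - d) / (c + d); tied pairs count toward neither."""
    c = d = 0
    n = len(z1)
    for i in range(n):
        for j in range(i + 1, n):
            s = (z1[i] - z1[j]) * (z2[i] - z2[j])
            if s > 0:
                c += 1
            elif s < 0:
                d += 1
    return (c - d) / (c + d)

def theta_from_tau(tau, family):
    """Invert the closed-form tau(theta) relationships for three copula families."""
    if family == "gaussian":
        return math.sin(math.pi * tau / 2.0)  # tau = (2/pi) arcsin(theta)
    if family == "clayton":
        return 2.0 * tau / (1.0 - tau)        # tau = theta / (theta + 2)
    if family == "gumbel":
        return 1.0 / (1.0 - tau)              # tau = 1 - 1/theta
    raise ValueError(f"no closed-form inverse implemented for {family!r}")

tau_hat = kendall_tau([1.0, 2.0, 3.0], [1.0, 3.0, 2.0])  # 2 concordant, 1 discordant
theta_hat = theta_from_tau(tau_hat, "clayton")
print(f"tau_hat = {tau_hat:.4f}, Clayton theta_hat = {theta_hat:.4f}")
```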


2.2. Joint PDF Construction assuming inter-modal independence

The joint PDF estimate assuming inter-modal statistical independence is
given as the product of the marginals, i.e.,

$$f_{\mathbf{Z}}(\mathbf{z}) = \prod_{n=1}^{N} f_{Z_n}(z_n) = f_{\mathbf{Z}}^{m}(\mathbf{z}) \tag{6}$$

where we use the superscript 'm' to denote that the joint PDF estimate
depends on the marginal independence assumption.

Thus, both joint PDF estimates, (3) and (6), capture the given
marginal densities. The copula based joint PDF estimate further
captures Kendall’s $\tau$, the rank correlation between the variables.

We  next  formulate  a  binary  hypotheses  testing  problem. 


3.  HYPOTHESES  TESTING 

A decision theory problem consists of deciding which of the
hypotheses $H_0, \ldots, H_K$ is true based on the acquired observation
vector of (say) L samples. An optimal test (in both the
Neyman-Pearson (NP) and Bayesian sense) for a two hypotheses problem
($H_0$ vs. $H_1$) computes the log-likelihood ratio ($\Lambda$) and decides in
favor of $H_1$ when the ratio is larger than a pre-defined threshold ($\eta$),

$$\Lambda(\mathbf{z}) = \log \frac{f_{\mathbf{Z}}(\mathbf{z}|H_1)}{f_{\mathbf{Z}}(\mathbf{z}|H_0)} \underset{H_0}{\overset{H_1}{\gtrless}} \eta \tag{7}$$

$f_{\mathbf{Z}}(\mathbf{z}|H_i)$ is the joint PDF of the random observation vector $\mathbf{z} =
[z_1, \ldots, z_N]^T \in \mathbb{R}^N$ under the hypothesis $H_i$, $(i = 0, 1)$ and
includes all the statistics required to derive (7). In the NP set up, the
threshold $\eta$ is selected to constrain the false alarm error probability,
$P_F$, to a value $\alpha < 1$ and at the same time minimize the probability
of miss, $P_M$. The two error probabilities are given as

$$P_F = P(\Lambda > \eta | H_0), \quad P_M = P(\Lambda < \eta | H_1) \tag{8}$$
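
As a rough illustration of the NP set up, the threshold can be chosen empirically from log-likelihood ratios simulated under $H_0$, and the resulting miss probability estimated under $H_1$. The scalar Gaussian mean-shift model (N(0,1) under $H_0$ vs. N(1,1) under $H_1$), the value of $\alpha$, and the helper names are assumptions made only for this sketch.

```python
import math
import random

def llr(z, mu=1.0):
    """Log-likelihood ratio log f(z|H1)/f(z|H0) for N(mu, 1) vs. N(0, 1)."""
    return mu * z - 0.5 * mu * mu

def np_threshold(llr_h0, alpha):
    """Empirical threshold whose false alarm fraction on llr_h0 is at most alpha."""
    ordered = sorted(llr_h0)
    k = math.ceil((1.0 - alpha) * len(ordered)) - 1
    return ordered[k]

rng = random.Random(0)
alpha = 0.05
llr_h0 = [llr(rng.gauss(0.0, 1.0)) for _ in range(20000)]  # statistics under H0
llr_h1 = [llr(rng.gauss(1.0, 1.0)) for _ in range(20000)]  # statistics under H1

eta = np_threshold(llr_h0, alpha)
p_f = sum(v > eta for v in llr_h0) / len(llr_h0)  # empirical false alarm rate
p_m = sum(v < eta for v in llr_h1) / len(llr_h1)  # empirical miss rate
print(f"eta = {eta:.3f}, P_F = {p_f:.3f}, P_M = {p_m:.3f}")
```

Sweeping `alpha` over (0, 1) and recording the pairs ($P_F$, $1 - P_M$) traces out a receiver operating characteristic for the test.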


Consider a binary hypotheses testing problem where

$$H_1 : f_{\mathbf{Z}}(z_1, z_2, \ldots, z_N | H_1)$$

$$H_0 : f_{\mathbf{Z}}(z_1, z_2, \ldots, z_N | H_0) = \prod_{n=1}^{N} f_{Z_n}(z_n | H_0) \tag{9}$$

Thus, it is known that the random variables $Z_1, \ldots, Z_N$ are
statistically independent under the hypothesis $H_0$. However, the joint
distribution $f_{\mathbf{Z}}(z_1, z_2, \ldots, z_N | H_1)$ under hypothesis $H_1$ is unknown.
We use copula theory to estimate $f_{\mathbf{Z}}(z_1, z_2, \ldots, z_N | H_1)$ and derive
the log-likelihood ratio test.


3.1. Heterogeneous log-likelihood ratio test

We use (3) to estimate $f_{\mathbf{Z}}(z_1, z_2, \ldots, z_N | H_1)$ in terms of marginals
and derive the copula based heterogeneous log-likelihood ratio test
(HLRT) statistic as below,

$$\Lambda_k(\mathbf{z}) = \log \frac{f_{\mathbf{Z}}(\mathbf{z}|H_1)}{f_{\mathbf{Z}}(\mathbf{z}|H_0)} = \log \frac{f_{\mathbf{Z}}^{k}(\mathbf{z}|H_1)}{f_{\mathbf{Z}}(\mathbf{z}|H_0)} \tag{10}$$

$$= \log \left( \prod_{n=1}^{N} \frac{f_{Z_n}(z_n|H_1)}{f_{Z_n}(z_n|H_0)} \right) + \log k\left( F_{Z_1}^{1}(z_1), \ldots, F_{Z_N}^{1}(z_N) \right) \tag{11}$$

where the superscript 'i' in $F_{Z_n}^{i}(z_n)$ denotes the CDF of $Z_n$ under
hypothesis $H_i$.

It can be seen from (11) that the copula function allows one to
exactly factor out the role of cross modal dependence from the
strategies employed by the individual modalities. This allows us to quantify
performance gains (if any) achieved due to inter-modal dependence.


3.2. Marginal log-likelihood ratio test

It is interesting to note the form of the test statistic in (11). The first
term,

$$\Lambda_m(\mathbf{z}) = \log \left( \prod_{n=1}^{N} \frac{f_{Z_n}(z_n|H_1)}{f_{Z_n}(z_n|H_0)} \right) \tag{12}$$

is the test obtained when dependence between the variables $Z_1, Z_2,
\ldots, Z_N$ is neglected. We call this test the marginal likelihood
ratio test (MLRT). In problems where the derivation of the joint
density becomes mathematically intractable, tests are usually employed
assuming independence between variables conditioned on each
hypothesis.

We next compare performances of HLRT and MLRT detectors.


3.3. Performance Analysis using Error Exponents

The asymptotic performance of a likelihood ratio test can be
quantified using the KL distance, $D(f_{\mathbf{Z}}(\mathbf{z}|H_1) \| f_{\mathbf{Z}}(\mathbf{z}|H_0))$, between
the PDFs underlying the two hypotheses. For L i.i.d.
N-variate measurements $\mathbf{Z}_l = [Z_1, Z_2, \ldots, Z_N]_l$, $(l = 1, 2, \ldots, L)$,
through Stein’s Lemma [5], we have for a fixed value of $P_M = \beta$,
$(0 < \beta < 1)$,

$$\lim_{L \to \infty} \frac{1}{L} \log P_F = -D(f_{\mathbf{Z}}(\mathbf{z}|H_1) \| f_{\mathbf{Z}}(\mathbf{z}|H_0)) \tag{13}$$

The greater the value of $D(f_{\mathbf{Z}}(\mathbf{z}|H_1) \| f_{\mathbf{Z}}(\mathbf{z}|H_0))$, the faster the
convergence of $P_F$ to zero as $L \to \infty$. The KL distance is thus
indicative of the performance of a log-likelihood ratio test. Further, it is
additive for independent observations,

$$D(f_{\mathbf{Z}}^{m}(\mathbf{z}|H_1) \| f_{\mathbf{Z}}(\mathbf{z}|H_0)) = \sum_{n=1}^{N} D(f_{Z_n}(z_n|H_1) \| f_{Z_n}(z_n|H_0)) \tag{14}$$

where $D(f_{Z_n}(z_n|H_1) \| f_{Z_n}(z_n|H_0))$ is the KL distance for a single
modality $Z_n$.

Assuming knowledge of the true underlying copula $c(\cdot)$, we
have the following theorem to compare the asymptotic performances
of HLRT and MLRT for the binary hypotheses testing problem
formulated in (9).

Theorem 2 The KL distance between the two competing hypotheses
($H_0$ vs. $H_1$) increases by an amount equal to the multiinformation (under
$H_1$) when dependence between the variables is taken into account.

$$D(f_{\mathbf{Z}}(\mathbf{z}|H_1) \| f_{\mathbf{Z}}(\mathbf{z}|H_0)) - D(f_{\mathbf{Z}}^{m}(\mathbf{z}|H_1) \| f_{\mathbf{Z}}(\mathbf{z}|H_0)) = \underbrace{I(Z_1; Z_2; \ldots; Z_N)}_{\geq 0} \tag{15}$$

Multiinformation [6] is defined as

$$I(Z_1; Z_2; \ldots; Z_N) = \int f_{\mathbf{Z}}(\mathbf{z}) \log \left( \frac{f_{\mathbf{Z}}(\mathbf{z})}{\prod_{n=1}^{N} f_{Z_n}(z_n)} \right) d\mathbf{z} \tag{16}$$

This is intuitively satisfying as the multiinformation $I(\cdot)$ (which
reduces to the well-known mutual information when N = 2)
describes the complete nature of dependence between the variables.
Our result, that the error exponent increases due to inter-modal
dependence, agrees with Koval et al. [7], where the authors use the
logarithmic inequality ($\log(x) \geq 1 - \frac{1}{x}$) to prove the result. Our
approach is based on the copula representation of the joint PDF and goes
further to quantify the performance loss ($= I(\cdot)$) when multimodal
signals are assumed to be statistically independent or when the dependence is
deliberately neglected for simplicity.
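
The relationship in Theorem 2 can be checked numerically in a case where every term has a closed form. The sketch below assumes a bivariate Gaussian model, with correlated components under $H_1$ and independent components under $H_0$. This special case is chosen only because the KL distances and the mutual information, $-\frac{1}{2}\log(1 - \rho^2)$, are known analytically; the parameter values and function names are hypothetical.

```python
import math

def kl_gauss_1d(mu1, var1, mu0, var0):
    """KL distance D(N(mu1, var1) || N(mu0, var0)) for one modality."""
    return 0.5 * (var1 / var0 + (mu1 - mu0) ** 2 / var0 - 1.0 + math.log(var0 / var1))

def kl_gauss_joint(mu1, var1, rho, mu0, var0):
    """KL distance between bivariate Gaussians.

    H1: means mu1, variances var1, correlation rho.
    H0: means mu0, variances var0, independent components (diagonal covariance).
    """
    trace = var1[0] / var0[0] + var1[1] / var0[1]
    quad = sum((a - b) ** 2 / v for a, b, v in zip(mu1, mu0, var0))
    log_det_ratio = math.log(var0[0] * var0[1]) - math.log(var1[0] * var1[1] * (1.0 - rho ** 2))
    return 0.5 * (trace + quad - 2.0 + log_det_ratio)

def mutual_information_gauss(rho):
    """Mutual information of a bivariate Gaussian with correlation rho."""
    return -0.5 * math.log(1.0 - rho ** 2)

mu1, var1, rho = (0.5, -0.2), (1.5, 2.0), 0.4  # illustrative H1 parameters
mu0, var0 = (0.0, 0.0), (1.0, 1.0)             # illustrative H0 parameters

d_joint = kl_gauss_joint(mu1, var1, rho, mu0, var0)
d_marginal = sum(kl_gauss_1d(m1, v1, m0, v0)
                 for m1, v1, m0, v0 in zip(mu1, var1, mu0, var0))
gap = d_joint - d_marginal
print(f"gap = {gap:.6f}, mutual information = {mutual_information_gauss(rho):.6f}")
```

The gap between the joint KL distance and the sum of the per-modality KL distances depends only on the correlation, not on the marginal parameters, which mirrors the separation of dependence and marginals in the copula representation.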

In the next section, we present a multisensor detection
example and compare detector receiver operating characteristics (ROCs)
when 100 i.i.d. sensor measurements are available in each decision
window.

4.  A  MULTISENSOR  DETECTION  EXAMPLE 

Consider a parallel network of sensors as shown in Fig. 1 where each
of the deployed sensors may have different sensing capabilities. The
sensors monitor a common region of interest and the goal is to design
an algorithm that can combine the acquired multimodal information
for detecting the occurrence of a random event. We generate a
synthetic multimodal data set and show simulation results for N = 2.
In the future, we will apply this approach to some real datasets and
evaluate its performance.

Fig. 1. A multisensor detection system with common region of
interest (ROI). Different shapes for the sensors denote different sensing
modalities.

Fig. 2. Monte Carlo based ROCs for MLRT and HLRT constructed
using different copula functions. We set $\beta_0 = \sigma_0^2 = 1$ and $\beta_1 =
\sigma_1^2 = 1.1$.

Copula models allow one to generate multivariate random
vectors whose properties satisfy Definition 1 [2]. We use the Student’s t
copula with four degrees of freedom (d.o.f.) and Kendall’s $\tau$ equal to
0.1 to generate bivariate multimodal data under the hypothesis $H_1$.
The marginals follow gamma and Gaussian distributions under each
hypothesis ($H_i$; $i = 0, 1$),


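$$f_{Z_1}(z_1|H_i) = \frac{1}{\beta_i^{\alpha_i} \Gamma(\alpha_i)} z_1^{\alpha_i - 1} e^{-z_1/\beta_i}, \quad z_1 > 0 \text{ and } \alpha_i, \beta_i > 0 \tag{17}$$

$$f_{Z_2}(z_2|H_i) = \frac{1}{\sqrt{2\pi\sigma_i^2}} e^{-z_2^2/(2\sigma_i^2)}, \quad -\infty < z_2 < \infty, \text{ and } \sigma_i^2 > 0 \tag{18}$$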

where $\beta_1 > \beta_0$ and $\sigma_1^2 > \sigma_0^2$. We set $\alpha_1 = \alpha_0$. While a
bivariate Gaussian distribution with non-identical means and/or variances
would satisfy Definition 1 and could have been used to generate the
synthetic data set, we adopt the above model to emphasize that the
proposed copula based methodology for multimodal signal
processing does not restrict the marginals to the same parametric family of
distributions.

Let $\Lambda_1$ and $\Lambda_2$ denote the log-likelihood ratio tests corresponding
to $Z_1$ and $Z_2$ respectively. We plot, in Fig. 2, the ROCs for $\Lambda_1$,
$\Lambda_2$, $\Lambda_m$ ($= \Lambda_1 + \Lambda_2$) and $\Lambda_k$ using 50,000 Monte Carlo trials.
Assuming that the true joint PDF (used to generate data) is unknown,
we construct $\Lambda_k$ using arbitrary copula functions (from Table 1). As
evident from Fig. 2, $\Lambda_m$ performs better than each of the single
sensor test statistics, as expected. However, fusion using HLRT
outperforms MLRT as the copula functions (even though misspecified)
capture the rank correlation between the multimodal measurements
and hence deviate from the erroneous inter-modal independence
assumption.


5.  CONCLUDING  REMARKS 

We have presented a copula based framework for multimodal signal
processing. We show how copula functions allow us to model the
joint PDF of multimodal measurements. We also present a multisensor
detection example where the use of copula functions is more
advantageous than assuming inter-modal independence. We note that this
may not be true in general. Derivation of a universal improvability
condition, selection of the best copula and the incorporation of
dependence measures other than the rank correlations will be addressed
in the future.